Maintenance
Restart R session in Rstudio
Use keyboard shortcut Ctrl + Shift + F10
Load the libraries
set.seed(1234)
if (!require("tidyverse")) install.packages("tidyverse"); library("tidyverse")
if (!require("sf")) install.packages("sf"); library("sf")
if (!require("tmap")) install.packages("tmap"); library("tmap")
if (!require("spdep")) install.packages("spdep"); library("spdep")
if (!require("sjmisc")) install.packages("sjmisc"); library("sjmisc")
tmap_mode("view")
Load the data
# prepared earlier
SA2_SEIFA <- readRDS("data/SA2_SEIFA.Rds")
From sf to sp
So far we worked with spatial objects represented by simple features from sf library. Long before sf library appeared R used different representation of spatial objects provided by sp library. Many packages might still depend on it so we will learn how to transform the data between then.
One notable example is spdep package that offers very powerful se tof tools to analyse spatial patterns of data.
We will reproject our SA2 & SEIFA data prepared before and convert it to sp object using as function:
SA2_SEIFA_proj <- SA2_SEIFA %>%
st_transform(3112) %>%
filter(!is.na(IRSAD_s)) %>%
as("Spatial")
Our object is now of a different class:
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
Because of that it has different properties and will behave differently. For instance. We cannot simply view the data associated with polygons:
Instead of that we have to access so called data “slot” by using @ notation:
View(SA2_SEIFA_proj@data)
Luckily, tmap library can work with both sf and sp objects so all we have learned so far can be still used. Try creating a quick map (qtm function) with SA2_SEIFA_proj.
Spatial neighbours
In order to examine relationship between regions we have to define how we would like to define who is included as a neighbour. One common approach is to define “queen” type of neighbouring (queen being relation to the queen’s movement in chess). We can create neighbours list from polygon list with poly2nb function:
(SA2_weights_queen <- poly2nb(SA2_SEIFA_proj, queen = TRUE))
## Neighbour list object:
## Number of regions: 305
## Number of nonzero links: 1764
## Percentage nonzero weights: 1.896264
## Average number of links: 5.783607
We can now turn this object into spatial weights (listw class) that are required by some tools:
(SA2_weights_queen_list <- nb2listw(SA2_weights_queen, style="W", zero.policy=TRUE))
## Characteristics of weights list object:
## Neighbour list object:
## Number of regions: 305
## Number of nonzero links: 1764
## Percentage nonzero weights: 1.896264
## Average number of links: 5.783607
##
## Weights style: W
## Weights constants summary:
## n nn S0 S1 S2
## W 305 93025 305 111.2742 1255.536
Picture might give it better overview than description - let’s plot the object to see how polygons are connected:
plot(SA2_SEIFA_proj, border = "grey60")
plot(SA2_weights_queen, coordinates(SA2_SEIFA_proj), pch = 19, cex = 0.6, add = TRUE)

We now defined all polygons that touch any part of their boundary to become neighbours. We will now work with statistics that use this neighbours to determine degree of global and local clustering.
Global spatial clustering
In order to check if our data is clustered in space we could use Moran’s I test for spatial autocorrelation. moran.test function from spdep package will perform that:
(SA2_IRSAD_s_Moran <- moran.test(SA2_SEIFA_proj$IRSAD_s, listw = SA2_weights_queen_list))
##
## Moran I test under randomisation
##
## data: SA2_SEIFA_proj$IRSAD_s
## weights: SA2_weights_queen_list
##
## Moran I statistic standard deviate = 17.232, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.586990113 -0.003289474 0.001173436
Interrpretation of the statistic is as follow:
Values of I usually range from −1 to +1. Values significantly below -1/(N-1) indicate negative spatial autocorrelation and values significantly above -1/(N-1) indicate positive spatial autocorrelation.
The significance of the statistic can be obtained from the Monte Carlo permutation test:
(SA2_IRSAD_s_Moran_MC <- moran.mc(SA2_SEIFA_proj$IRSAD_s,
SA2_weights_queen_list,
nsim = 999))
##
## Monte-Carlo simulation of Moran I
##
## data: SA2_SEIFA_proj$IRSAD_s
## weights: SA2_weights_queen_list
## number of simulations + 1: 1000
##
## statistic = 0.58699, observed rank = 1000, p-value = 0.001
## alternative hypothesis: greater
We can also get some insights into spatial patterning of the data by using Moran scatterplot, which plots data against its spatially lagged values,
moran.plot(SA2_SEIFA_proj$IRSAD_s, listw = nb2listw(SA2_weights_queen),
xlab = "ISRAD score", ylab = "Lag of ISRAD score")

Data points are divided into four “quadrants” depending on where they lie in respect to their values to lagged values relation. For instance top-right quadrant contains higher values of variable surrounded also by neighboring areas with higher values as well. That could point to spatial clustering. Bottom-right quadrant on the other hand points to high values of variable surrounded by areas with lower values - indication for spatial outliers. We will use these quadrants and visualize them on maps.
Local spatial clustering
Moran’s I tells us if our data are globally clustered without indicating where exactly it is happening. We can learn about location of clusters and outliers by using local indicators of spatial association (LISA).
Data preparation
We first prepare our data by creating a standardized version of it using scale function.
SA2_SEIFA_proj$IRSAD_s_std <- scale(SA2_SEIFA_proj$IRSAD_s)
We also prepare spatially lagged variable using spatial weights we created above.
SA2_SEIFA_proj$IRSAD_s_lag <- lag.listw(SA2_weights_queen_list, SA2_SEIFA_proj$IRSAD_s_std)
We can create a variable with four classes depending on which quadrant of the Moran’s plot the region lies on the scaterplot of standardized versus lagged values. Four possible conditions are:
SA2_SEIFA_proj$quad <- NA
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std > 0 & SA2_SEIFA_proj$IRSAD_s_lag > 0, "quad"] <- 1
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std < 0 & SA2_SEIFA_proj$IRSAD_s_lag < 0, "quad"] <- 2
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std > 0 & SA2_SEIFA_proj$IRSAD_s_lag < 0, "quad"] <- 3
SA2_SEIFA_proj@data[SA2_SEIFA_proj$IRSAD_s_std < 0 & SA2_SEIFA_proj$IRSAD_s_lag > 0, "quad"] <- 4
Now we are ready to map our results. Each region belongs to one quadrant:
##
## # x <numeric>
## # total N=305 valid N=305 mean=1.93 sd=0.97
##
## val frq raw.prc valid.prc cum.prc
## 1 125 40.98 40.98 40.98
## 2 106 34.75 34.75 75.74
## 3 45 14.75 14.75 90.49
## 4 29 9.51 9.51 100.00
## <NA> 0 0.00 NA NA
We can map it into four categories giving it descriptive names and colours matching clusters (reds) and outliers (blues):
tm_shape(SA2_SEIFA_proj) +
tm_fill(col = "quad", alpha = 0.5, style = "cat",
palette = c("red", "blue", "lightpink", "skyblue2"),
labels = c("High-High", "Low-Low", "High-Low", "Low-High"),
title = "Local Moran's I")
LISA
LISA analysis can be performed using localmoran function from spdep package. We use variable of interest (IRSAD_s score) and listw object with spatial weights of neighbours:
SA2_IRSAD_s_lm <- localmoran(SA2_SEIFA_proj$IRSAD_s, listw = SA2_weights_queen_list)
We obtain an object of localmoran class that stores five results:
## Ii E.Ii Var.Ii
## Min. :-1.089355 Min. :-0.003289 Min. :0.07981
## 1st Qu.: 0.007243 1st Qu.:-0.003289 1st Qu.:0.13914
## Median : 0.251212 Median :-0.003289 Median :0.16287
## Mean : 0.586990 Mean :-0.003289 Mean :0.18720
## 3rd Qu.: 0.958551 3rd Qu.:-0.003289 3rd Qu.:0.19610
## Max. : 4.996230 Max. :-0.003289 Max. :0.49514
## Z.Ii Pr(z > 0)
## Min. :-2.45254 Min. :0.00000
## 1st Qu.: 0.02124 1st Qu.:0.01074
## Median : 0.60875 Median :0.27134
## Mean : 1.44160 Mean :0.28056
## 3rd Qu.: 2.29933 3rd Qu.:0.49153
## Max. :15.24860 Max. :0.99291
The values here indicate:
Ii - local moran statistic E.Ii - expectation of local moran statistic Var.Ii - variance of local moran statistic Z.Ii - standard deviate of local moran statistic Pr() - p-value of local moran statistic
We can now extend quadrant assignment by additionally marking results significant depending on p-value obtained from the localmoran test (stored in the fifth column of results).
SA2_SEIFA_proj$quad_sig <- SA2_SEIFA_proj$quad
SA2_SEIFA_proj@data[SA2_IRSAD_s_lm[, 5] >= 0.05, "quad_sig"] <- 5
Taking itnto account significance of results, we can add one more category to the results for regions that are not significantly different according to permutation test:
frq(SA2_SEIFA_proj$quad_sig)
##
## # x <numeric>
## # total N=305 valid N=305 mean=3.87 sd=1.67
##
## val frq raw.prc valid.prc cum.prc
## 1 51 16.72 16.72 16.72
## 2 47 15.41 15.41 32.13
## 5 207 67.87 67.87 100.00
## <NA> 0 0.00 NA NA
Since we only have first and second category, we modify our palette and labels arguments and also remove colour for insignificat results:
tm_shape(SA2_SEIFA_proj) +
tm_fill(col = "quad_sig", alpha = 0.5, style = "cat",
# palette = c("red", "blue", "lightpink", "skyblue2", "white"),
# labels = c("High-High", "Low-Low", "High-Low", "Low-High", "Not Signif."),
palette = c("red", "blue", NA),
labels = c("High-High", "Low-Low", "Not Signif."),
title = "Local Moran's I")
How does that map compare to original map of IRSAD score and deciles?
Further topics
Construct neighbours using argument queen = FALSE. What differences can you see. In what types of situations would you expect it boe the biggest issue?
Examine global and local autocorrelation of other SEIFA index.